Clustering using Positional Association Rules Algorithm on Protein Sequence Motifs
Authors
Abstract
In bioinformatics, protein sequence motif analysis can help reveal the role and function of specific proteins. Many motif discovery algorithms require a fixed window size to be defined in advance. Because of the fixed size, these approaches often return a number of similar motifs that are simply shifted by a few positions or that differ by mismatches. To address this problem, earlier work in this field introduced a new kind of association rule, the Positional Association Rule, which searches for distances that occur frequently between pairs of motifs. In this work we adapt this association rule algorithm to "cluster" protein sequence motifs and thereby address the problem created by the fixed window size approach. Although association rule based algorithms have been widely used in association analysis and classification, few are designed as clustering methods. We believe this paper offers future research a novel way to find patterns and associations in protein sequences and similar applications.
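The idea of searching for frequent distances between motifs can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the input layout (one dictionary of motif start positions per sequence), the function names `positional_rules` and `cluster_motifs`, the `min_support` and `max_shift` parameters, and the union-find grouping step are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def positional_rules(occurrences, min_support):
    """Count (motif_a, motif_b, distance) triples across sequences.

    occurrences: one dict per sequence, mapping a motif id to the list
    of start positions where that motif was found (assumed layout).
    A triple is counted at most once per sequence; only triples seen
    in at least min_support sequences are returned.
    """
    counts = Counter()
    for seq in occurrences:
        seen = set()
        for a, b in combinations(sorted(seq), 2):
            for pos_a in seq[a]:
                for pos_b in seq[b]:
                    seen.add((a, b, pos_b - pos_a))
        counts.update(seen)
    return {rule: n for rule, n in counts.items() if n >= min_support}


def cluster_motifs(motifs, rules, max_shift):
    """Group motifs linked by a frequent rule with a small offset.

    Hypothetical clustering step: motifs joined by a rule whose
    absolute distance is at most max_shift are treated as shifted
    copies of one pattern and merged with a simple union-find.
    """
    parent = {m: m for m in motifs}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b, distance in rules:
        if abs(distance) <= max_shift:
            parent[find(a)] = find(b)

    groups = {}
    for m in motifs:
        groups.setdefault(find(m), set()).add(m)
    return sorted(sorted(g) for g in groups.values())


# Toy data: M2 always starts two positions after M1; M3 is unrelated.
occurrences = [
    {"M1": [10], "M2": [12], "M3": [40]},
    {"M1": [5], "M2": [7]},
    {"M1": [20], "M2": [22], "M3": [90]},
]
rules = positional_rules(occurrences, min_support=3)
clusters = cluster_motifs(["M1", "M2", "M3"], rules, max_shift=3)
```

On this toy input the only rule that reaches the support threshold is the offset of 2 between M1 and M2, so those two motifs end up in one cluster and M3 stays on its own. A real pipeline would also need a principled way to choose the support threshold and the maximum shift.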
Similar resources
Discovery and Extraction of Protein Sequence Motif Information that Transcends Protein Family Boundaries
Protein sequence motifs are gathering more and more attention in the field of sequence analysis. The recurring patterns have the potential to determine the conformation, function and activities of the proteins. In our work, we obtained protein sequence motifs which are universally conserved across protein family boundaries. Therefore, unlike most popular motif discovering algorithms, our input ...
Applying a decision support system for accident analysis by using data mining approach: A case study on one of the Iranian manufactures
Uncertain and stochastic states have always been taken into consideration in the fields of risk management and accidents, as in other fields of industrial engineering, and have made decision making difficult and complicated for managers when selecting corrective actions and control measures. In this research, huge data sets of the accidents of a manufacturing and industrial unit have been st...
Mining the Banking Customer Behavior Using Clustering and Association Rules Methods
The unprecedented growth of competition in banking technology has raised the importance of retaining current customers and acquiring new ones, so it is important to analyze customer behavior based on bank databases. Analyzing bank databases for customer behavior is difficult, since bank databases are multi-dimensional, comprised of monthly account records and daily t...
A Method for Automatically Finding Structural Motifs in Proteins
A lot of work has gone into predicting the secondary (small scale) structure of proteins from their amino acid sequence. Current research indicates that there are limits on how well secondary structure can be predicted from local sequence information. To further advance prediction, the interactions between elements of secondary structure which are inherently non-local, have to be better underst...
Retaining Customers Using Clustering and Association Rules in Insurance Industry: A Case Study
This study clusters customers and finds the characteristics of different groups in a life insurance company in order to find a way for prediction of customer behavior based on payment. The approach is to use clustering and association rules based on CRISP-DM methodology in data mining. The researcher could classify customers of each policy in three different clusters, using association rules. A...